Serbian Spa Waters

Group 6

  • Elizabeth
  • Claude
  • Leah
  • Harry

Introduction

  • Analysis of hydrochemical and radiological data of mineral and thermal waters in Serbia.
  • Original authors used PCA and HCA to classify the waters with respect to their geotectonic units.
  • Factor analysis was then applied in an attempt to improve on the original results.

About the dataset

  • 30 observations
  • 1 categorical variable
    • Samples collected from four geological structures:
      • Hydrogeological basins
      • Karstic terrains
      • Volcanogenic massifs
      • Metamorphic regions

12 numerical variables

Variable          Description                          Units
T                 Temperature                          \(^\circ\)C
pH                pH level (acidity/alkalinity)
EC                Electrical conductivity              \(\mu\)S/cm
TS                Total dissolved solids               g/L
Ca\(^{2+}\)        Calcium                              mg/L
Mg\(^{2+}\)        Magnesium                            mg/L
Na\(^{+}\)         Sodium                               mg/L
K\(^{+}\)          Potassium                            mg/L
Cl\(^{-}\)         Chloride                             mg/L
SO\(^{2-}_4\)      Sulfate                              mg/L
HCO\(^{-}_3\)      Bicarbonate                          mg/L
SiO\(_2\)          Silica (dissolved silicon dioxide)   mg/L

Observations per geological structure

Geological Structure      Number of Observations
Hydrogeological Basins    5
Karstic Terrains          5
Volcanogenic Massifs      14
Metamorphic Regions       6
  • Unbalanced data set: volcanogenic massifs contribute 14 of the 30 observations
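
To make the imbalance concrete, the group shares can be computed directly from the counts in the table. This is a minimal sketch rather than part of the original analysis; the names `counts`, `shares` and `imbalance_ratio` are illustrative.

```python
# Observation counts per geological structure, as listed on this slide.
counts = {
    "Hydrogeological Basins": 5,
    "Karstic Terrains": 5,
    "Volcanogenic Massifs": 14,
    "Metamorphic Regions": 6,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Ratio of the largest to the smallest group; 1.0 would mean perfectly balanced.
imbalance_ratio = max(counts.values()) / min(counts.values())

for name, share in shares.items():
    print(f"{name:<24} {counts[name]:>2}  {share:.1%}")
print(f"total = {total}, largest/smallest = {imbalance_ratio:.1f}")
```

The largest group is almost three times the size of the smallest, which is worth keeping in mind when reading any per-group classification result.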

Correlation Heatmap

Checking for multivariate normality

  • Assumption of multivariate normality not satisfied
  • Log transformations needed
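
As a rough illustration of why a log transformation helps, the sketch below compares the sample skewness of a right-skewed variable before and after taking natural logs. The values in `sample` are hypothetical and are NOT taken from the spa-water dataset. Univariate skewness is also only a quick diagnostic, not a test of multivariate normality, so this stands in for the idea rather than for the check actually used.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, strictly positive, right-skewed concentrations in mg/L.
# Illustrative only: these are not observations from the dataset.
sample = [5.0, 8.0, 12.0, 20.0, 35.0, 60.0, 150.0, 400.0]

# The log is only defined for positive values, which holds for concentrations.
logged = [math.log(v) for v in sample]

raw_skew = skewness(sample)
log_skew = skewness(logged)

print(f"skewness before log: {raw_skew:.2f}")
print(f"skewness after log:  {log_skew:.2f}")
```

For this made-up sample the skewness drops substantially after the transformation, which is the behaviour the log step relies on.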

Re-checking for normality

PCA

                                PC1        PC2        PC3        PC4        PC5        PC6        PC7        PC8
Proportion of Variance    0.4841385  0.1793697  0.1083222  0.0779122  0.0608540  0.0339688  0.0245024  0.0137945
Cumulative Proportion     0.4841385  0.6635082  0.7718303  0.8497425  0.9105965  0.9445654  0.9690677  0.9828622
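
The cumulative row is just a running sum of the proportion row, and it can be used to see how many components are needed to reach a given share of variance. The sketch below uses the proportions from the table; the helper `components_needed` and the 80% and 90% thresholds are illustrative choices, not criteria stated in the original analysis.

```python
from itertools import accumulate

# Proportion of variance for PC1..PC8, copied from the table.
proportion = [
    0.4841385, 0.1793697, 0.1083222, 0.0779122,
    0.0608540, 0.0339688, 0.0245024, 0.0137945,
]

# Running sum reproduces the "Cumulative Proportion" row (up to rounding).
cumulative = list(accumulate(proportion))


def components_needed(threshold):
    """Smallest number of leading PCs whose cumulative share reaches threshold."""
    for i, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return i
    raise ValueError("threshold not reached by the listed components")


for target in (0.80, 0.90):
    print(f"{target:.0%} of variance: first {components_needed(target)} PCs")
```

With these values, four components pass 80% and five pass 90% of the total variance.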

PCA

Variable           PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8
log.tempCels    -0.023  -0.329   0.521   0.302  -0.653  -0.015  -0.044   0.284
log.pH          -0.304  -0.294  -0.187   0.077   0.349   0.314  -0.186   0.707
log.elec.Cond    0.378  -0.197  -0.089   0.009   0.077  -0.275  -0.224   0.153
log.totSolid     0.367  -0.212  -0.172  -0.045   0.041  -0.324  -0.212   0.117
log.Ca2          0.187   0.551   0.083   0.065   -0.02   0.394   0.251    0.33
log.Mg2          0.277   0.438   0.049   -0.14   -0.03  -0.398   -0.02   0.417
log.Na           0.345  -0.301  -0.102   0.102   0.151   0.318   0.073  -0.213
log.K            0.376   0.041    0.14   -0.08  -0.103   0.474  -0.196  -0.113
log.Cl           0.271  -0.076  -0.125   0.676    0.16  -0.111   0.525   0.046
log.SO2          0.053    0.06   0.717   0.163   0.597  -0.068   -0.25  -0.094
log.HCO          0.388  -0.004  -0.118  -0.053  -0.124   0.253  -0.302   0.062
log.SiO          0.174  -0.353    0.27  -0.611    0.11   0.014   0.577   0.163
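
One way to read a loadings table is to rank the variables by absolute loading within each component. The sketch below does this for the first three components, using the values from the table. The helper `top_variables` and the choice of showing the top two variables are illustrative, not something prescribed by the original analysis.

```python
# Loadings on PC1, PC2, PC3 for each log-transformed variable (from the table).
loadings = {
    "log.tempCels":  (-0.023, -0.329,  0.521),
    "log.pH":        (-0.304, -0.294, -0.187),
    "log.elec.Cond": ( 0.378, -0.197, -0.089),
    "log.totSolid":  ( 0.367, -0.212, -0.172),
    "log.Ca2":       ( 0.187,  0.551,  0.083),
    "log.Mg2":       ( 0.277,  0.438,  0.049),
    "log.Na":        ( 0.345, -0.301, -0.102),
    "log.K":         ( 0.376,  0.041,  0.140),
    "log.Cl":        ( 0.271, -0.076, -0.125),
    "log.SO2":       ( 0.053,  0.060,  0.717),
    "log.HCO":       ( 0.388, -0.004, -0.118),
    "log.SiO":       ( 0.174, -0.353,  0.270),
}


def top_variables(pc_index, k=2):
    """Names of the k variables with the largest absolute loading on one PC."""
    ranked = sorted(
        loadings, key=lambda name: abs(loadings[name][pc_index]), reverse=True
    )
    return ranked[:k]


for pc_index in range(3):
    print(f"PC{pc_index + 1}: {', '.join(top_variables(pc_index))}")
```

Read this way, PC1 is led by bicarbonate and electrical conductivity, PC2 by calcium and magnesium, and PC3 by sulfate and temperature. Several other PC1 loadings (potassium, total dissolved solids, sodium) are almost as large, so PC1 looks more like an overall mineralization axis than a single-variable effect.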

PC2 vs. PC1

Factor Analysis

Confusion matrix